Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: match before checking cache #447

Merged
merged 1 commit into from
Oct 14, 2024
Merged

Conversation

brianmcgee
Copy link
Member

@brianmcgee brianmcgee commented Oct 13, 2024

This changes the flow of processing to make unmatched behaviour more consistent.

Before, we had been:

  • traversing the filesystem
  • comparing with the cache and only emitting files which had changed
  • applying the matching rules to determine which formatters should be applied to a given file
  • applying the formatters

Now, we do the following:

  • traverse the filesystem
  • apply the matching rules to determine which formatters should be applied to a given file
  • compare with the cache and only emit files which have changed for formatting
  • apply the formatters

It does mean we are applying the matching rules against files which we may not have to format, but in testing against Nixpkgs the performance impact appears negligible.

This makes sense since most of the processing time will be spent in the formatters, not applying some globs to file paths.

Close #433

@brianmcgee
Copy link
Member Author

I just realized I can further simplify the stats since matched and formatted will always be the same.

@brianmcgee brianmcgee force-pushed the feat/refine-on-unmatched branch from 0e8ffff to 6bc953f Compare October 13, 2024 13:05
Copy link
Collaborator

@jfly jfly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, thanks for making the change =)

I just realized I can further simplify the stats since matched and formatted will always be the same.

Sorry, I think I'm not understanding the vocabulary. Here's what I thought:

  • matched: the number of files in the repo that treefmt "has opinions" about
  • formatted: of the files we matched, the number of files that treefmt invoked a formatter on. caching means we may choose not to format a matched file (because we know it hasn't changed).
  • changed: of the files we formatted, how many of them actually materially changed

cmd/format/format.go Show resolved Hide resolved
@brianmcgee
Copy link
Member Author

  • matched: the number of files in the repo that treefmt "has opinions" about
  • formatted: of the files we matched, the number of files that treefmt invoked a formatter on. caching means we may choose not to format a matched file (because we know it hasn't changed).
  • changed: of the files we formatted, how many of them actually materially changed

This is a good summary.

@jfly
Copy link
Collaborator

jfly commented Oct 13, 2024

Sorry, I didn't get to my point, haha:

since matched and formatted will always be the same.

Is this true? When we have cache hits, won't we have files that we match that we end up not formatting?

@brianmcgee
Copy link
Member Author

Sorry, I didn't get to my point, haha:

since matched and formatted will always be the same.

Is this true? When we have cache hits, won't we have files that we match that we end up not formatting?

You're not wrong. Now that I look at this again, I have no idea why I removed the distinction. I think I need to take a break from treefmt for a few days after this.

Putting it back in.

@brianmcgee brianmcgee force-pushed the feat/refine-on-unmatched branch 2 times, most recently from 7b88088 to 080888c Compare October 13, 2024 15:40
@brianmcgee brianmcgee requested a review from jfly October 13, 2024 15:44
@brianmcgee brianmcgee mentioned this pull request Oct 13, 2024
1 task
Copy link
Collaborator

@jfly jfly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

cmd/format/format.go Show resolved Hide resolved
cmd/format/format.go Outdated Show resolved Hide resolved
@brianmcgee brianmcgee force-pushed the feat/refine-on-unmatched branch from 080888c to 52ac241 Compare October 14, 2024 09:04
This changes the flow of processing to make unmatched behaviour more consistent.

Before, we had been:

- traversing the filesystem
- comparing with the cache and only emitting files which had changed
- applying the matching rules to determine which formatters should be applied to a given file
- applying the formatters

Now, we do the following:

- traverse the filesystem
- apply the matching rules to determine which formatters should be applied to a given file
- compare with the cache and only emit files which have changed for formatting
- apply the formatters

It does mean we are applying the matching rules against files which we may not have to format, but in testing against Nixpkgs the performance impact appears negligible.

This makes sense since most of the processing time will be spent in the formatters, not applying some globs to file paths.

Signed-off-by: Brian McGee <[email protected]>
@brianmcgee brianmcgee force-pushed the feat/refine-on-unmatched branch from 52ac241 to 71b262f Compare October 14, 2024 09:08
@brianmcgee brianmcgee merged commit a14e1b3 into main Oct 14, 2024
27 checks passed
@brianmcgee brianmcgee deleted the feat/refine-on-unmatched branch October 14, 2024 09:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

treefmt cache results in warnings getting suppressed
2 participants